Beyond Logarithmic Bounds in Online Learning

Authors

  • Francesco Orabona
  • Nicolò Cesa-Bianchi
  • Claudio Gentile
Abstract

We prove logarithmic regret bounds that depend on the loss L_T of the competitor rather than on the number T of time steps. In the general online convex optimization setting, our bounds hold for any smooth and exp-concave loss (such as the square loss or the logistic loss). This bridges the gap between the O(ln T) regret exhibited by exp-concave losses and the O(√L_T) regret exhibited by smooth losses. We also show that these bounds are tight for specific losses, thus they cannot be improved in general. For online regression with square loss, our analysis can be used to derive a sparse randomized variant of the online Newton step, whose expected number of updates scales with the algorithm's loss. For online classification, we prove the first logarithmic mistake bounds that do not rely on prior knowledge of a bound on the competitor's norm.
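To make the regression result concrete, here is a minimal sketch of an online Newton step for the square loss with a randomized skip rule, so that an update is performed with probability proportional to the current loss. The step size alpha, the regularizer eps, and the update probability min(1, loss) are illustrative assumptions, not the paper's exact construction, and the projection onto a bounded domain is omitted.

```python
import numpy as np

def sparse_online_newton_step(X, y, alpha=0.1, eps=1.0, rng=None):
    """Online Newton step for the square loss with randomized sparse updates.

    X: (T, d) array of feature vectors; y: (T,) array of targets.
    alpha, eps, and the update probability min(1, loss) are illustrative
    choices, not the paper's constants; the projection onto a bounded
    domain is omitted for brevity.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, d = X.shape
    w = np.zeros(d)          # current linear predictor
    A = eps * np.eye(d)      # accumulated second-order information
    updates = 0
    for t in range(T):
        x_t, y_t = X[t], y[t]
        loss = 0.5 * (w @ x_t - y_t) ** 2
        # Randomized sparsity: skip the update with probability 1 - min(1, loss),
        # so the expected number of updates scales with the incurred loss.
        if rng.random() < min(1.0, loss):
            g = (w @ x_t - y_t) * x_t                      # gradient of the square loss
            A += np.outer(g, g)                            # rank-one curvature update
            w = w - (1.0 / alpha) * np.linalg.solve(A, g)  # Newton-style step
            updates += 1
    return w, updates
```

When the predictor already fits a round well, the loss and hence the update probability are small, which is the mechanism behind a loss-dependent update count.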

Similar resources

Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning

We present a learning algorithm for undiscounted reinforcement learning. Our interest lies in bounds for the algorithm’s online performance after some finite number of steps. In the spirit of similar methods already successfully applied for the exploration-exploitation tradeoff in multi-armed bandit problems, we use upper confidence bounds to show that our UCRL algorithm achieves logarithmic on...
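The snippet appeals to the upper-confidence-bound principle from multi-armed bandits. As a hedged illustration of that principle only (a UCB1-style arm selection, not the UCRL algorithm itself, which builds confidence sets over transition models and rewards):

```python
import math

def ucb1_pick(counts, mean_rewards, t):
    """Pick an arm by the UCB1 rule: empirical mean plus a confidence radius.

    counts[i]: number of pulls of arm i so far; mean_rewards[i]: its
    empirical mean reward; t: total pulls so far. Illustrates the
    optimism-under-uncertainty idea that UCRL lifts from bandits to
    reinforcement learning.
    """
    for i, n in enumerate(counts):
        if n == 0:                     # pull every arm once before using the rule
            return i
    ucb = [m + math.sqrt(2 * math.log(t) / n)
           for m, n in zip(mean_rewards, counts)]
    return max(range(len(ucb)), key=ucb.__getitem__)
```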

On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities

We study an equivalence of (i) deterministic pathwise statements appearing in the online learning literature (termed regret bounds), (ii) high-probability tail bounds for the supremum of a collection of martingales (of a specific form arising from uniform laws of large numbers for martingales), and (iii) in-expectation bounds for the supremum. By virtue of the equivalence, we prove exponential ...

Optimal convex combinations bounds of centroidal and harmonic means for logarithmic and identric means

We find the greatest values $\alpha_{1}$ and $\alpha_{2}$, and the least values $\beta_{1}$ and $\beta_{2}$ such that the inequalities $\alpha_{1} C(a,b)+(1-\alpha_{1})H(a,b)$ ...
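For context, the means named in the title have standard definitions; for $0 < a < b$, the harmonic, centroidal, logarithmic, and identric means are

$$H(a,b) = \frac{2ab}{a+b}, \qquad C(a,b) = \frac{2(a^{2}+ab+b^{2})}{3(a+b)},$$

$$L(a,b) = \frac{b-a}{\ln b - \ln a}, \qquad I(a,b) = \frac{1}{e}\left(\frac{b^{b}}{a^{a}}\right)^{1/(b-a)}.$$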

(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings

We give differentially private algorithms for a large class of online learning algorithms, in both the full information and bandit settings. Our algorithms aim to minimize a convex loss function which is a sum of smaller convex loss terms, one for each data point. To design our algorithms, we modify the popular mirror descent approach, or rather a variant called follow the approximate leader. T...
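As a rough illustration of the gradient-perturbation idea behind private online learning (a noisy online gradient descent step, not the paper's follow-the-approximate-leader construction, and without the noise calibration needed for a formal privacy guarantee):

```python
import numpy as np

def private_ogd(grads, lr=0.1, noise_scale=1.0, rng=None):
    """Online gradient descent with per-step Laplace noise on the gradient.

    grads: iterable of per-round gradient vectors (one per data point).
    This is only a schematic of gradient perturbation; calibrating
    noise_scale to a formal (epsilon, delta) guarantee is omitted.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = None
    iterates = []
    for g in grads:
        g = np.asarray(g, dtype=float)
        if w is None:
            w = np.zeros_like(g)
        noisy_g = g + rng.laplace(scale=noise_scale, size=g.shape)
        w = w - lr * noisy_g       # standard OGD step on the noisy gradient
        iterates.append(w.copy())
    return iterates
```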

A new look at shifting regret

We investigate extensions of well-known online learning algorithms such as fixed-share of Herbster and Warmuth (1998) or the methods proposed by Bousquet and Warmuth (2002). These algorithms use weight sharing schemes to perform as well as the best sequence of experts with a limited number of changes. Here we show, with a common, general, and simpler analysis, that weight sharing in fact achiev...
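For reference, one round of the fixed-share forecaster of Herbster and Warmuth (1998) is easy to state; eta and alpha below are illustrative parameter values:

```python
import numpy as np

def fixed_share_update(w, losses, eta=0.5, alpha=0.05):
    """One round of the fixed-share forecaster of Herbster and Warmuth (1998).

    w: current weights over N experts (summing to 1); losses: the experts'
    losses this round. An exponential-weights update is followed by a
    sharing step that redistributes a fraction alpha of the weight
    uniformly, which lets the forecaster track the best sequence of
    experts with a limited number of switches.
    """
    v = w * np.exp(-eta * np.asarray(losses))   # exponential-weights step
    v /= v.sum()
    n = len(v)
    return (1 - alpha) * v + alpha / n          # uniform weight sharing
```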

Publication date: 2012